

Abstract


Six methods of estimating regression weights for a linear model of
behavior were compared in 51 samples of National Guardsmen. Ordinary least
squares, Bayesian m-group regression, ridge regression, equal weighting, and
two related methods were used. Weights were estimated in one-half of each
sample and then applied to data in the other half. Ratios of observations
to predictors ranged from 4:1 to 19:1. Cross validation r was used as the
index of model or equation stability. Results support earlier findings that
least squares weights are relatively unstable in small samples, but do not
indicate the superiority of any one other method. Future research and
implications for using these regression techniques in testing behavioral
models are discussed.


Linear  equations  frequently  appear  as  descriptions  of  human  behavior. 
A  criterion  variable  is  viewed  as  an  additive  function  of  one  or  more 
predictor  variables.  The  differential  weighting  of  the  standardized 
predictor  variables  is  often  assumed  to  convey  the  relative  importance  of 
these  variables  in  explaining  the  criterion.  However,  when  differential 
weights  are  generated  statistically  from  a  sample  of  observations,  their 
properties  must  be  understood  before  strong  interpretations  can  be  made. 
The  purpose  of  this  study  is  to  examine  empirically  several  methods  of 
statistically  computing  weights. 

Multiple  regression  has  certainly  been  the  most  popular  method  used  to 
estimate  weights  in  the  linear  model.  Despite  warnings  about  the  sampling 
error  of  ordinary  least  squares  regression  weights  (OLS)  (Wainer,  1976, 
1978)  and  interpretative  ambiguity  when  predictors  are  correlated 
(Darlington,  1968;  Johnston,  1972),  many  researchers  continue  to  base 
interpretations  of  results  on  the  relative  size  of  these  weights.  Johnston 
(1972)  provides  three  reasons  why  interpretations  must  be  qualified  when  the 
predictor  variables  are  correlated.  First,  it  is  difficult  to  determine  the 
relative  influence  of  various  predictors.  Second,  an  investigator  may  drop 
a  potentially  interesting  and  useful  variable  from  further  consideration 
because  its  OLS  regression  weight  is  not  significantly  different  from  zero. 
Finally,  the  estimates  of  the  weights  become  very  sensitive  to  particular 
samples  of  data. 

When an investigator tests a model in (or generates a model from) only
one sample of data, two interrelated issues must be considered before
assuming that OLS provides the "best" weights. First, the sampling error of
the weights must temper direct interpretation of the weights. Sampling
error is negatively related to sample size and positively related to the
intercorrelations among predictors. Small samples and correlated predictors
are quite common in applied psychological research. Second, theoretical
interpretation of the OLS equation implies a decision concerning an index
of predictor "importance." As Darlington (1968) points out, unless the
predictors are mutually uncorrelated, this is a highly ambiguous matter.
Beta weights (standardized regression weights) may be only vaguely related
to predictor validity ($r_{yx_i}$), and heavily influenced by the other
variables in the regression equation.
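
The dependence of a beta weight on the other predictor can be made concrete with the standard two-predictor identity. This is a textbook result rather than a finding of the present study, and the numerical values below are hypothetical, chosen only to resemble two highly correlated predictors:

```latex
\beta_1 = \frac{r_{y1} - r_{12}\, r_{y2}}{1 - r_{12}^2}, \qquad
\beta_2 = \frac{r_{y2} - r_{12}\, r_{y1}}{1 - r_{12}^2}

% Hypothetical values: r_{y1} = .40, r_{y2} = .45, r_{12} = .72
\beta_1 = \frac{.40 - (.72)(.45)}{1 - (.72)^2} = \frac{.076}{.4816} \approx .16, \qquad
\beta_2 = \frac{.45 - (.72)(.40)}{1 - (.72)^2} = \frac{.162}{.4816} \approx .34
```

In this illustration the two validities differ by only .05, yet one beta weight is roughly twice the other, which is the kind of ambiguity Darlington (1968) describes.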


In pure prediction situations, validity of regression equations is
often based on the associated multiple correlation (see Footnote 2). The
effect of sampling error on the multiple correlation coefficient can be
demonstrated through the application of various shrinkage formulas
(Drasgow, Dorans and Tucker, 1979; Schmitt, Coyle, and Rauschenberger, 1977;
Wherry, 1931) and cross validation (Dunnette, 1966). The latter method may
give an underestimate of the long run or population cross validity
(Schmidt, 1971) but does provide the researcher with an estimate of
shrinkage for specific situations and purposes.

Wainer (1976, 1978) has suggested that the shrinkage associated with
OLS is so undesirable that, under many circumstances, OLS should be
abandoned as a means for establishing weights in the linear model. Rather,
he states that the unbiased properties of OLS do not practically reduce the
mean squared error (MSE) of prediction over many other biased weighting
strategies. In particular, Wainer (1976) indicates that the equal weighting
(EW) of all predictor variables results in an equation that is robust to
sampling differences and in many cases has an MSE that is only slightly
higher than OLS in the derivation sample.


Some previous research has compared the predictive stability of EW as
an alternative weighting scheme to OLS. In early studies, the superiority
of EW over OLS was demonstrated in samples of blue collar workers (Trattner,
1963) and freshman engineering students (Lawshe and Shucker, 1959). Dawes
and Corrigan (1974) showed that EW provided a linear composite that was more
valid than OLS in describing both the process of decision making and its
validity. Recently, several investigators (Dorans & Drasgow, 1978; Einhorn
& Hogarth, 1975; Schmidt, 1971) have used Monte Carlo procedures to give
good indications of when equal weights are superior to OLS weights. When
the sample sizes (N) are small relative to the number of predictors (n)
[N=25, n=2 (Schmidt, 1971); N=50, n=4 (Einhorn and Hogarth, 1975); N=30,
N=60, n=11 (Dorans and Drasgow, 1978)], equal weights are clearly superior
to OLS when cross validated in the population or samples of equal size.
Einhorn and Hogarth (1975) also noted that as the criterion (or dependent)
variable becomes more ambiguous, equal weights are superior to both OLS and
rational (clinical) weights. In attempting to recover population regression
weights from sample data, which is directly relevant for model testing,
Dorans and Drasgow (1978) found that with N=30 and N=60 (with n=11), equal
weights had a higher congruence coefficient (Korth and Tucker, 1975) with
the population weights than OLS. Another interesting finding by the same
authors was that the congruence coefficients for OLS and EW were
approximately equal at N=120, though EW had the higher cross validation
$r^2$. They also found that the pattern of EW superiority was less obvious
when the communality of the criterion was low in the population and when the
prespecified factor model provided only a moderate fit to the population
covariance matrix.

Overall, the Monte Carlo results do point to the efficiency of equal
weights in small samples when randomly drawn from multivariate normal
populations. The interpretation and generalization of these results to
applied settings depends on the tenability of the assumptions. For
instance, Schmidt (1971) suggested that in 20% of the data samples in
applied psychology, there are significant departures from the assumptions of
the regression model: linearity of regression, normality, and homogeneity of
conditional variance. Both he and Einhorn and Hogarth (1975) interpreted
their results as conservative estimates of the difference between OLS
$r^2_{cv}$ and EW $r^2_{cv}$.

Based  on  these  results,  it  is  tempting  to  endorse  equal  weighting  as 
the  desired  method  in  many  common  research  situations.  However,  equal 
weighting  has  serious  drawbacks  as  a  method  for  testing  behavioral  models. 
Model  testing  in  field  settings  depends  on  the  ability  to  assign 
differential  weights  to  predictor  variables  based  on  empirical  observation 
and  statistical  theory.  As  Darlington  (1978)  has  noted,  equal  weights  are 
not  based  on  any  such  theory. 

In  addition  to  equal  weights,  there  have  been  numerous  suggestions  for 
alternatives  to  OLS  estimates  of  regression  weights.  This  paper  considers  2 
alternatives,  ridge  regression  and  Bayesian  m-group  regression.  (See  Dorans 
&  Drasgow,  1978  for  a  description  and  Monte  Carlo  comparison  of  some 
others.)

Several authors (Darlington, 1978; Hoerl and Kennard, 1970; Price,
1977; Theobold, 1974) have recommended the use of ridge regression as an
approach to reducing the ambiguity in sample estimates of population weights
among correlated predictors. The object is to obtain biased estimators of
the true weights that have smaller sampling variances than OLS estimates.
Since these authors have all offered descriptions of the method, the present
one will be brief. When computing standardized OLS weights from a
correlation matrix, the following formula can be used:

$$b = (X'X)^{-1} X'Y$$

where b is the vector of standardized regression weights, X'X is the
correlation matrix for the predictors, and X'Y is the vector of correlations
between predictors and criterion. In ridge regression, a small positive
value (K) is added to the diagonal elements of the X'X matrix. The formula,

$$b(K) = (X'X + K)^{-1} X'Y,$$

where K is the vector of small positive values that introduces the bias into
the regression estimates (b(K)), has the positive effect of reducing
sampling variance. The larger the values in K, the smaller the sampling
variance. However, as K gets large, the elements of the weight vector b(K)
approach zero. Hoerl and Kennard (1970) and Theobold (1974) demonstrated
theoretically that there exists a vector of elements in K such that

$$E[(b(K) - B)'(b(K) - B)] < E[(b - B)'(b - B)]$$

where b is the vector of OLS sample weights and B is the vector of
population OLS weights. The biased estimators, b(K), have better mean
squared error properties in the long run than do OLS estimators if the
squared bias is less than or equal to the reduction in sampling variance.
There has been no evidence regarding the optimal method for determining K.
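
For the two-predictor case used in this study, the ridge formula can be written in closed form. The following is a minimal sketch, not the procedure actually used to select ridge constants here: the function name `ridge_weights_2` is hypothetical, a single scalar k is assumed for both diagonal elements (the text allows a vector K), and the correlations are invented for illustration.

```python
def ridge_weights_2(r12, ry1, ry2, k=0.0):
    """Standardized weights for two predictors, computed from correlations.

    Solves (R + k*I) b = r_xy, where R is the 2x2 predictor correlation
    matrix and r_xy holds the predictor-criterion correlations.
    With k = 0 the result is the ordinary least squares solution.
    """
    a = 1.0 + k                  # diagonal element after adding k
    det = a * a - r12 * r12      # determinant of (R + k*I)
    if det <= 0:
        raise ValueError("R + k*I is singular; predictors are collinear")
    b1 = (a * ry1 - r12 * ry2) / det
    b2 = (a * ry2 - r12 * ry1) / det
    return b1, b2


# Hypothetical correlations, not taken from the study's tables.
ols = ridge_weights_2(0.72, 0.30, 0.45, k=0.0)
ridge = ridge_weights_2(0.72, 0.30, 0.45, k=0.2)
print("OLS:  ", ols)    # the first weight is negative
print("RIDGE:", ridge)  # both weights are positive
```

With these invented values the OLS weight for the first predictor is slightly negative even though its validity is positive, and a modest k makes both weights positive, which is the pattern Price (1977) reports.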

Price (1977) applied ridge regression to the problem of predicting job
satisfaction  from  a  set  of  5  leadership  variables  that  were  linearly  related 
in  a  sample  of  30  observations.  His  results  suggest  that  the  method  does 
eliminate  negative  beta  weights  when  it  is  likely  that  the  negative  OLS 
estimates  are  due  to  sampling  error.  No  evidence  was  presented  supporting 
the  stability  of  the  ridge  regression  weights. 

Bayesian m-group regression, another alternative to least squares
regression, has been proposed as a solution to the problem of estimating
regression coefficients in possibly different groups when the groups are
small (Jackson, Novick, and Thayer, 1971; Novick, Jackson, Thayer, and Cole,
1972). The objective is to improve prediction in the ith group by using
information from the other m-1 groups. Prior information for the Bayesian
regression is based on a set of assumptions regarding the nature of the
regression weights. In the ordinary least squares model, $B_i$ (the vector
of weights) is fixed for the ith group. In this Bayesian model, $B_i$ is
assumed to be randomly sampled from a multivariate normal distribution.
Prior beliefs about the regression weights are then expressed by specifying
values for the parameters of the $B_i$ distribution, and the weight to be
given to these prior estimates. In applying m-group regression, Novick et
al. (1972) have assumed that only minimal knowledge exists about this
distribution of $B_i$. Therefore, relatively little weight is given to the
estimated prior as compared to information from new group-specific data.

These priors, the least squares regression weights within each group,
and the number of observations in each group systematically affect the
estimated weights. The specification of a large population variance of the
regression weights indicates that more heterogeneity is believed to exist
among the groups. Therefore the m-group regression weights will be similar
to the OLS weights. In an extreme case where the groups are believed to be
identical, the population variances will be zero and $B_i$ will be identical
for all groups. Group size also affects the estimated $B_i$. For a given
set of population weight variances, the number of observations in the ith
group is inversely related to the effect of the data from the other m-1
groups on the regression within the ith group.
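
The three tendencies just described (large between-group variance gives weights near OLS, zero variance gives identical weights, and larger groups borrow less from the others) can be illustrated with a scalar shrinkage rule. This is only a sketch under strong simplifying assumptions: it is the one-parameter normal-normal compromise, not the multivariate m-group procedure of Jackson, Novick, and Thayer, and the function name, arguments, and numbers are hypothetical.

```python
def shrink_weight(b_group, b_pooled, n_group, between_var, within_var):
    """Precision-weighted compromise between a group's weight and a pooled weight.

    between_var : assumed variance of the true weights across groups
    within_var  : assumed per-observation error variance of a group's estimate
    """
    sampling_var = within_var / n_group   # sampling variance of the group estimate
    if between_var == 0 and sampling_var == 0:
        return b_group
    # reliance = 0 uses the pooled weight only; reliance = 1 uses the group weight only
    reliance = between_var / (between_var + sampling_var)
    return reliance * b_group + (1.0 - reliance) * b_pooled


# Hypothetical values: a group OLS weight of .60 and a pooled weight of .30.
for n in (8, 20, 40):
    print(n, round(shrink_weight(0.60, 0.30, n, between_var=0.02, within_var=1.0), 3))
```

As the group size grows in this loop, the estimate moves from the pooled weight toward the group's own weight.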

In  summary,  theory  testing  and  model  building  in  field  settings  have 
depended  heavily  on  the  use  of  linear  equations.  Previous  research  has 
failed  to  provide  a  psychologically  acceptable  solution  for  statistically 
estimating  differential  weights  in  samples  of  data  that  are  similar  to  those 
often  encountered  in  field  settings  (limited  number  of  observations  and 
correlated  predictors). 

The  commonly  used  method  of  least  squares  regression  provides  unbiased 
estimates  of  population  weights,  but  they  are  difficult  to  interpret  and 
have  low  stability  when  estimated  and  applied  in  small  samples.  Equal 
weighting  provides  a  more  robust  prediction  model  for  most  situations  but 
lacks the ability to integrate data and theory. Recently, Bayesian m-group
regression and ridge regression have been offered as alternatives that may
provide this integrative ability without sacrificing predictive stability.

However, very little empirical research has actually compared these
methods to the more traditional methods in samples that are drawn from field
settings. The current study conducts such a comparison across multiple
samples of various sizes. Though not providing concrete answers to
questions about relative stability in different sample sizes, the results
have a great deal of generalizability due to the collection of data from
human respondents. Several trends can be expected. First, m-group
regression uses data from more than one group. This should lead to greater
stability than either least squares or ridge regression, which use only
group-specific data. Second, past evidence has suggested that equal
weighting is superior to many other methods in small samples (e.g., Dorans
and Drasgow, 1978). Therefore, it is expected to out-perform least squares
and ridge regression because the latter two are susceptible to sampling
error. Bayesian m-group regression is expected to have greater stability
than all other methods because it considers both group-specific data and
data from other groups. It allows for sample differences while using
other data to add stability.


Sample 

The data for this study were selected from a total sample of 2079
personnel in 60 units of the Illinois Army National Guard. The initial data
are arbitrarily divided into Wave 1 (29 units, N=1169) and Wave 2 (31 units,
N=910) corresponding to the time of data collection. The sample sizes of
the units range from N=2 to N=80. Other analyses of these same data are
reported in Horn and Hulin (Note 1) and Katerburg and Hulin (Note 2).

Each National Guard unit served as a sample of observations. It was
felt that the objective differences among the units (Katerburg and Hulin,
Note 2) provided a rationale for hypothesizing that not all observations
were drawn from the same population (i.e., the institutions are not
identical). The Bayesian m-group regression explicitly allows this prior
belief to be incorporated into the estimation of weights for each unit.

In  order  to  maintain  a  reasonably  large  sample  of  units  and  still 
retain  enough  observations  to  estimate  weights  in  each  unit,  only  units  with 
N  >  16  observations  were  included  to  estimate  weights  for  two  predictors 
(n=2).  All  29  units  were  retained  from  Wave  1  and  22  units  were  retained 
from  Wave  2.  The  data  from  each  of  these  51  units  were  randomly  divided 
into  a  derivation  sample  and  a  cross-validation  sample.  The  2  predictor 
weights  were  estimated  in  the  derivation  sample  and  then  applied  to  the  data 
in  the  cross-validation  sample. 
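
The split-half design can be sketched as follows. This is an illustration only: the source does not say how the random division was carried out or how odd unit sizes were handled, so the seeding and the rule that the cross-validation half receives the extra observation are assumptions, and `split_unit` is a hypothetical name.

```python
import random


def split_unit(observations, seed=0):
    """Randomly divide one unit's observations into two halves.

    Returns (derivation, cross_validation). With an odd count the
    cross-validation half receives the extra observation.
    """
    shuffled = list(observations)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is reproducible
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical unit of 17 respondents identified by index.
derivation, cross_validation = split_unit(range(17), seed=1)
print(len(derivation), len(cross_validation))
```

Weights would then be estimated from `derivation` only and evaluated on `cross_validation`.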

Assessment  of  Predictor  Variables 

Two  sets  of  2  predictor  variables  were  chosen,  one  for  the  Wave  1  units 
and  one  for  the  Wave  2  units.  The  first  set  was  chosen  such  that  the  two 
variables  were  highly  correlated.  This  should  drastically  increase  the 
sampling  error  of  the  least  squares  weights.  The  Consideration  scale  from 
the Leader Behavior Description Questionnaire (Stogdill & Coons, 1963) and
the Satisfaction with Supervisor scale from the Job Descriptive Index, JDI
(Smith, Kendall, and Hulin, 1969) were chosen because the median
intercorrelation was .72 in the 29 units from Wave 1. These variables were
used as predictors in the Wave 1 units.

In  the  second  wave,  the  objective  was  to  choose  2  predictors  that 
consistently  produced  the  highest  multiple  correlation  with  the  criterion. 
The  JDI  Satisfaction  with  Work  scale  and  JDI  Satisfaction  with  Coworkers 
scale were used as predictors in the Wave 2 units (median R = .56).

Assessment of the Criterion Variable

A  criterion  variable  represents  a  behavior  (or  set  of  behaviors)  that 
is  theoretically  related  to  the  predictor  variables.  Behavioral  intention 
to  reenlist  (hereafter  referred  to  as  behavioral  intention)  was  chosen 
because  it  is  related  conceptually  to  withdrawal  decisions,  and  reenlistment 
data  were  not  available  for  all  respondents.  Behavioral  intention  was 
measured by a single item with a seven point verbally anchored response
scale.

The predictive validity of this item was estimated in a sample of 255
respondents for whom data were available. The correlation between responses
to this item and actual reenlistment was .70 (Horn and Hulin, Note 1).

In  summary,  the  Wave  1  model  included  LBDQ-Consideration  and 
JDI-Supervisor  to  account  for  variance  in  behavioral  intention.  The  Wave  2 
model  included  JDI-Work  and  JDI-Coworkers  to  account  for  variance  in 
behavioral  intention. 

Estimation of Weights

Weights for the two predictor variables were computed in the derivation
samples for six alternate weighting schemes. The first four schemes have
been described earlier. These are ordinary least squares (OLS), equal
weights (EW), Bayesian m-group regression (BAY), and ridge regression
(RIDGE) (see Footnote 3). Two additional methods serve as baselines for
comparative purposes. First, ordinary least squares weights were computed
from all data available in the derivation samples. As a model this scheme
(OLS-Total) assumes that all units are identical and that the larger number
of observations adds to the stability of the weights. The final scheme is
derived from the Bayesian methodology and is called the generalized weight
equation (GWE).

Cross-validation involved transforming these weights to raw score
weights. Dorans and Drasgow (Note 3) have suggested that this
transformation is necessary for all cross validation studies because it does
not make the possibly erroneous assumption that the ratios of standard
deviations, $SD_Y / SD_{X_i}$, will not change from derivation to cross
validation. In order to cross validate standardized weights, they should be
transformed back to the original metric and applied to the cross validation
variance-covariance matrix.


For both GWE and OLS-Total, the scalar for this transformation was

$$\frac{SD_Y}{SD_{X_i}},$$

where $SD_Y$ and $SD_{X_i}$ are based on all of the data in the derivation
samples. This is consistent with the assumptions that 1) no prior unique
data exist for the cross-validation samples and 2) all units are identical
and the greater number of observations gives a more stable estimate of the
standard deviation in each unit. For the other 4 schemes $SD_Y$ and
$SD_{X_i}$ were based on the data from the derivation sample of each unit
individually. These raw score weights were then applied to the cross
validation variance-covariance matrix ($C_{(cv)}$) according to the
following formula:

$$r^2_{cv} = \frac{(b_D' C_{XY(cv)})^2}{S^2_{Y(cv)} \, (b_D' C_{XX(cv)} b_D)} \qquad (1)$$

where $C_{XY(cv)}$ is the vector of covariances between the two predictors
and the criterion, $C_{XX(cv)}$ is the variance-covariance matrix for the two
predictors, $b_D$ is the raw score weight vector computed in the derivation
sample, and $S^2_{Y(cv)}$ is the variance of the criterion in the cross
validation sample.
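
The rescaling and formula (1) can be sketched for the two-predictor case. This is a hedged illustration rather than the original computation: it reads formula (1) as the squared covariance of the weighted composite with the criterion, divided by the product of the criterion variance and the composite variance. The function names, the signed-result convention, and all numbers are assumptions made for the example.

```python
def to_raw_weights(betas, sd_y, sd_x):
    """Rescale standardized weights to raw score weights: b_i = beta_i * SD_Y / SD_Xi."""
    return [beta * sd_y / sd for beta, sd in zip(betas, sd_x)]


def cross_validation_r2(b, c_xy, c_xx, var_y, signed=True):
    """Squared correlation between the composite b'X and Y in the hold-out sample.

    b     : raw score weights from the derivation sample (length 2)
    c_xy  : hold-out covariances of each predictor with the criterion
    c_xx  : hold-out 2x2 (symmetric) predictor variance-covariance matrix
    var_y : hold-out variance of the criterion
    """
    cov_composite_y = b[0] * c_xy[0] + b[1] * c_xy[1]        # b' C_xy
    var_composite = (b[0] * b[0] * c_xx[0][0]
                     + 2.0 * b[0] * b[1] * c_xx[0][1]
                     + b[1] * b[1] * c_xx[1][1])             # b' C_xx b
    if var_composite <= 0 or var_y <= 0:
        raise ValueError("composite and criterion variances must be positive")
    r2 = cov_composite_y ** 2 / (var_y * var_composite)
    # Assumption: a negative composite-criterion covariance is reported as a negative r2.
    return -r2 if signed and cov_composite_y < 0 else r2


# Hypothetical hold-out covariances, not taken from the study.
c_xy = [2.0, 3.0]
c_xx = [[4.0, 1.0], [1.0, 9.0]]
print(cross_validation_r2([1.0, 1.0], c_xy, c_xx, var_y=4.0))
print(cross_validation_r2([3.0, 3.0], c_xy, c_xx, var_y=4.0))
```

The two calls return the same value because multiplying every weight by a positive constant rescales the numerator and denominator equally, so only the relative sizes of the weights matter for this index.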


Results


Wave  1 

Table 1 shows the total sample sizes and standard deviations of
JDI-Supervisor ($X_1$), LBDQ-Consideration ($X_2$), and behavioral intention
(Y) in both the derivation sample and cross validation sample for 29 first
wave units. The total samples reported in Table 1 were divided into two
samples of approximately equal size for derivation and cross validation of
weights. In each unit the small differences between derivation and cross
validation samples in standard deviation are best explained as sampling
error. On the other hand, the somewhat larger differences across the 29
derivation sample standard deviations (the lowest to highest values are .33
to 2.88 for behavioral intention, 3.73 to 13.56 for JDI-Supervisor, and 4.62
to 18.13 for LBDQ-Consideration) support the hypothesis that they were drawn
from different populations. As would be expected, there are also large
differences across units in the cross validation samples.

Table 2 presents the standardized weights for the two predictor
variables from each of the six weighting schemes for all 29 units. These
weights are typically used in interpreting regression equations. The choice
of 1.0 as a value for the equal weighting is arbitrary. As equal
standardized weights, any value could be used for the present study. When
the cross validation $r^2$ is computed, the variance of the predictor
composite is irrelevant. The OLS scheme yields the following results. In 12
units, [...] a negative weight. In 7 units, LBDQ receives [...].
JDI-Supervisor receives a larger weight (absolute value) in 14 units and
LBDQ-Consideration receives the larger weight in the other 15.

As for the alternate schemes, Bayesian m-group regression eliminates
all negative weights for both variables. JDI-Supervisor receives the larger
weight in 17 units and the weights are equal in one unit. Ridge regression
also eliminates all negative regression weights. In the 14 units that had
ridge estimates, LBDQ-Consideration had the larger weight in 6 units and
JDI-Supervisor had the larger weight in the other 8 units. The two baseline
methods, GWE and OLS-Total, give positive and virtually equal weights to
both variables.

The $r^2_{cv}$ for the six weighting schemes in the 29 units appears in Table
3. A negative $r^2_{cv}$ reflects the fact that the $r_{cv}$ was less than 0.
In other words, the variables were weighted in the opposite direction. The
signs are retained in Table 3 to illustrate this effect. With OLS, 5 units
had an $r_{cv}$ of less than 0. For the other 5 methods, only one unit (No.
5) had a negative $r_{cv}$.

In averaging these statistics across the 29 units, negative $r^2_{cv}$ was
considered a 0. This change increases the average $r^2_{cv}$
($\bar{r}^2_{cv}$) for OLS, and yet the method still had the lowest
$\bar{r}^2_{cv}$ across the 29 units. Somewhat surprisingly, there was very
little difference among the $\bar{r}^2_{cv}$ from the other 4 weighting
schemes. No statistics exist to compute the statistical significance of
differences among the $\bar{r}^2_{cv}$.
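
The averaging rule amounts to flooring each sign-retained value at zero before taking the mean. A minimal sketch, with a hypothetical function name and invented values rather than entries from Table 3:

```python
def mean_cross_validation_r2(signed_r2_values):
    """Average r2_cv across units, counting any negative (sign-retained) value as zero."""
    floored = [max(value, 0.0) for value in signed_r2_values]
    return sum(floored) / len(floored)


# Hypothetical signed r2_cv values for four units.
values = [0.20, -0.10, 0.30, 0.00]
print(mean_cross_validation_r2(values))
```

Because negative entries are raised to zero, this average can only equal or exceed the plain mean of the signed values, which is why the text notes that the rule favors OLS.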

If only the 14 units with RIDGE weights are considered, RIDGE gives the
highest $r^2_{cv}$ while OLS gives the lowest. Again it is difficult to
assess the stability of these differences.

Wave 2

For this wave, Table 4 presents the total unit sample sizes and standard
deviations for the two predictors, JDI-Work ($X_1$) and JDI-Coworkers
($X_2$), and behavioral intention (Y) in both derivation and cross
validation samples. Again, there is a fairly wide range of standard
deviations for all three variables across the 22 units.

Table 5 presents the standardized weights derived from the six methods.
By design, the predictors were not as highly correlated as in Wave 1. This
may explain the OLS results, which are more consistent for JDI-Work ($X_1$).
The only negative weight for this variable appears in unit 7. On the other
hand, JDI-Coworkers ($X_2$) received a negative weight in 12 units. In 9
units, both variables received positive weights. JDI-Work received the
higher weight (absolute value) in 18 units.

Bayesian  m-group  regression  eliminated  the  negative  weight  for  JDI-Work 
in  unit  7.  In  11  units,  BAY  yielded  a  negative  weight  for  JDI-Coworkers. 
In  21  of  the  22  units,  the  Bayesian  weight  for  JDI-Work  was  larger  (absolute 
value)  than  that  for  JDI-Coworkers.  RIDGE  resulted  in  the  negative  weight 
for  JDI-Work  in  unit  7  while  yielding  a  negative  weight  for  JDI-Coworkers  in 
3  units.  In  14  units,  JDI-Work  received  the  larger  weight.  The  baseline 
methods  gave  a  positive  weight  for  JDI-Work  and  small  negative  weights  for 
JDI-Coworkers. 

Again, these weights were transformed using the same scalars as in Wave
1. The weights were then applied to the cross validation
variance-covariance matrix according to (1). Table 6 shows the $r^2_{cv}$
for RIDGE in 17 units and the $r^2_{cv}$ for the other 5 methods in all 22
units. OLS yielded 3 small negative $r_{cv}$. BAY, RIDGE, GWE, and
OLS-Total resulted in small negative $r_{cv}$ for 2 units.

On the average, OLS and EW did somewhat poorer than the other 3
methods in this wave. Considering only the 17 units where RIDGE weights
were computed, the 3 methods that use the data from all of the units (BAY,
GWE, and OLS-Total) demonstrated a slightly higher $\bar{r}^2_{cv}$.
Surprisingly, EW showed the poorest stability.


DISCUSSION 

These weighting schemes were computed as models of human behavior. For
the current predictor variables (facets of satisfaction and leadership),
positive weighting of both is consistent with previous research and theory
that indicates that they are negatively related to organizational withdrawal
(Fleishman and Harris, 1962; Smith et al., 1969).

The six computational methods yielded different patterns of weights.
First, under the condition of highly correlated predictors, least squares
weights predictably vacillated between positive and negative values. The
inflated sampling error of these weights was demonstrated in the lower cross
validation $r^2$ associated with them. The Bayesian and ridge regression
methods eliminated the negative weights, but only ridge regression showed
improved stability. Interpretation of this latter result must be qualified
by the non-random selection of the 14 units that may favor ridge regression.

In the second wave of data, the least squares weight for JDI-Work was
consistently positive (except for one unit). In contrast to the first wave,
this result indicates the increased stability of the sign of an individual
weight when the predictors are correlated to a lesser degree. The ridge
regression procedure eliminated more negative weights on the second
variable, JDI-Coworkers, than did the Bayesian method. However, the
baseline methods indicate that a small negative weight may be realistic for
the second variable in many samples.

As for the stability of the models, the cross validation results from
this study showed mixed support for the hypothesized trends. Surprisingly,
the far greater complexity and computing time necessary to estimate the
Bayesian weights did not pay off in greater stability in these data. It
appears that m-group regression may not be a cost-effective method of
estimating weights under conditions similar to those found in the present
study. In the first wave of units, the m-group regression stability was
only as good as that of equal weighting. In Wave 2, it was equivalent to
OLS-Total, indicating that its increased stability over least squares may be
due more to its use of more data than to its method. OLS-Total is a much
less expensive method of obtaining stable and possibly different weights.

Ridge regression did show some promise in the first wave, where the
predictors were more highly correlated. In the 14 units for which ridge
constants could be computed, its stability exceeded that of the other
methods. Its limitation is, of course, its inappropriateness for the other
15 units. The investigator has the option of resorting to least squares
weights when ridge constants cannot be computed. Had this been done in the
present data, ridge regression stability would have exceeded that of least
squares on the 29 units, but fallen short of that of the other 4 methods.
It does have the advantage of needing less data than BAY, OWE, or OLS-Total.

As for the generalizability of this study, the data were highly
characteristic of field studies in general. Small sample sizes, relatively
low criterion communality, and limited numbers of samples are very common in
the study of behavior in field settings. It would have been gratifying to
discover a panacea for the weights estimation problem under these
circumstances. Unfortunately, the results of this study suggest that this
solution has not yet been achieved. The most that can be concluded from the
current study is that least squares provides somewhat poorer estimates than
other methods.

One major limitation was that only two predictor variables were
included in each equation. As more predictors are added, the estimation and
interpretation of weights become more complex. When possible, it would be
interesting to study the performance of these weighting schemes with more
predictors in larger samples. Perhaps under this condition, ridge or
Bayesian regression would demonstrate greater stability than equal
weighting. Another weakness was the relatively small number of samples, 29
in Wave 1 and 22 in Wave 2. Future research would ideally use more samples
of approximately equal sizes in order to obtain better estimates of the
differences in stability between estimation methods. Further work should
also be conducted on the validity of the estimated weights as opposed to
their stability. Since the testing of behavioral models in correlational
field studies requires the interpretation of weights, more explicit
statements about the accuracy of weighting methods are desired.

For the present, it is argued that because no method demonstrated
clear-cut superiority over equal weighting, the current state of our
research methods does not yet permit us to test differentially weighted
models in small samples. We knew this was true for least squares weights.
The present findings suggest that it is also true for several other
advocated methods.


1. This research was supported in part by the Office of Naval Research,
Contract N000-14-75-C-0904, Charles L. Hulin, principal investigator; in
part by the Department of Psychology, University of Illinois; and in
part by the Illinois National Guard.

2. In order to reduce ambiguity, the following conventions are used for
the present study. First, the term equation validity will be taken
to mean the similarity of the estimated weights to the true population
weights. The congruence coefficient (Korth and Tucker, 1975) used by
Dorans and Drasgow (1978) is an example of a statistic that indicates
equation validity. Second, equation stability will mean the predictive
effectiveness of a set of estimated weights in future samples. The
indices most commonly used are the shrunken $R^2$ (e.g., Wherry, 1931) and
the cross-validated $r^2$ ($r^2_{cv}$). The disadvantage of the former is
that it is meaningful only for least squares estimates, whereas the latter
is appropriate for weights derived from any procedure. The distinction
between equation validity and stability is made because the
interpretation of the relative size of estimated weights implies that
the equation has some validity. However, the estimation of stability
through cross-validation may not give much indication about this
validity. Darlington (1968) and Wainer (1976) both point out that
widely different sets of weights can yield equally stable equations,
but of course are not all equally valid. Previous researchers have
focused on the stability of the equations. Dorans and Drasgow (1978)
did compute congruency coefficients for different weighting schemes, and
the results were not identical to those of the $r^2_{cv}$. Empirical
studies (non-Monte Carlo) have no recourse but to use stability as a
criterion, since population weights are unknown. However, studies such as
the present one must be interpreted with the distinction between validity
and stability in mind.
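
The two indices named in this note can be sketched in a few lines. The
code below is a minimal illustration under stated assumptions: the
function names and the six-case hold-out sample are hypothetical, the
congruence coefficient is computed as the cosine between two weight
vectors, and $r^2_{cv}$ is taken as the squared correlation between the
weighted composite and the criterion in a sample not used to estimate
the weights. Because population weights are unknown here, the example
compares two weight sets with each other rather than with true weights.

```python
import math


def congruence(w_a, w_b):
    """Congruence coefficient: cosine of the angle between two weight vectors."""
    num = sum(a * b for a, b in zip(w_a, w_b))
    den = math.sqrt(sum(a * a for a in w_a) * sum(b * b for b in w_b))
    return num / den


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cross_validated_r2(weights, predictors, criterion):
    """Squared correlation between weighted composites and the criterion
    in a hold-out sample. The weights are applied as given, not refit."""
    composite = [sum(w * x for w, x in zip(weights, row)) for row in predictors]
    return pearson_r(composite, criterion) ** 2


# Hypothetical hold-out sample: two predictors per case, one criterion score.
holdout_x = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
holdout_y = [2, 3, 4, 4, 6, 5]

equal_weights = (1.0, 1.0)    # equal weighting
unequal_weights = (2.0, 0.0)  # all weight on the first predictor

print("congruence:", round(congruence(equal_weights, unequal_weights), 3))
print("r2_cv equal:", round(cross_validated_r2(equal_weights, holdout_x, holdout_y), 3))
print("r2_cv unequal:", round(cross_validated_r2(unequal_weights, holdout_x, holdout_y), 3))
```

In this made-up sample the two weight sets are far from proportional, yet
both predict the hold-out criterion well, which is the sense in which
stability alone says little about validity.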

3.  The  present  authors  would  like  to  thank  Jerry  Isaacs  for  making  the 
Bayesian  m-group  regression  program  available.  We  would  also  like  to 
thank  Richard  Darlington  for  making  the  ridge  regression  program 
available. 